Enterprise Database Systems
Data Warehousing with Hadoop
Data Warehousing with Hadoop: HDInsight and Retail Sales Implementation Using Hive
Data Warehousing with Hadoop: Managing Big Data Using HDInsight Hadoop
Data Warehousing with Hadoop: Microsoft Analytics Platform System and Hive
Data Warehousing with Hadoop: Spark, HDInsight and Cluster Management

Data Warehousing with Hadoop: HDInsight and Retail Sales Implementation Using Hive

Course Number:
it_dfdwha_03_enus
Lesson Objectives

  • illustrate various data modeling approaches that are adopted for provisioning data warehousing solutions
  • identify the roles of fact and dimension tables in the dimensional design process
  • recall the essential dimensional design processes and the objectives of the various steps involved
  • depict the essential business use cases of data warehousing in the retail sales domain
  • create dimension tables in Hive
  • create fact tables for retail use cases in Hive
  • load data in the dimension and fact tables
  • identify the essential queries required to fetch data from retail schemas
  • construct and execute queries to get the desired outputs from Hive tables
  • create dashboards in Power BI from Hive data
  • write a Hive query to extract data from dimension and fact tables using joins

Overview/Description

This course covers the implementation of data warehousing in retail sales. Learners design and implement data warehousing solutions using Hive and Power BI on HDInsight.



Prerequisites: none

Data Warehousing with Hadoop: Managing Big Data Using HDInsight Hadoop

Course Number:
it_dfdwha_01_enus
Lesson Objectives

  • recognize the critical features provided by HDInsight to manage big data
  • list the types of clusters that can be implemented with HDInsight
  • recall the various open source components of HDInsight and their roles in managing cluster, data, and jobs
  • demonstrate how to set up Hadoop clusters on Azure HDInsight
  • create Hadoop HDInsight clusters using an Azure Resource Manager template
  • specify the essential capabilities of HDInsight and the types of storage that can be provisioned to store data
  • illustrate the critical capabilities afforded by the Azure Management Console
  • create, manage, and monitor HDInsight clusters using the Azure Management Console
  • set up the HDInsight Emulator and use PowerShell to execute essential commands
  • identify the various approaches of programming in HDInsight
  • develop and execute MapReduce programs using cmdlets and Hadoop streaming
  • set up Hadoop clusters on HDInsight and execute MapReduce applications

Overview/Description

Explore the fundamentals of Azure HDInsight and the essential architectural components.



Prerequisites: none

Data Warehousing with Hadoop: Microsoft Analytics Platform System and Hive

Course Number:
it_dfdwha_02_enus
Lesson Objectives

  • illustrate capabilities, features, and objectives of the Microsoft Analytics Platform System
  • specify how to manage data using PolyBase and the benefits that PolyBase provides
  • identify the role of parallel data warehousing architecture in the Microsoft Analytics Platform System
  • recall the various data exploration architectures that can be implemented using HDInsight and the Microsoft Analytics Platform System
  • describe the role of Hive as a data warehouse system for Hadoop
  • describe the architectural composition of Hive in HDInsight
  • set up the development environment for Hive using Azure HDInsight Tools for VS Code
  • connect and submit queries to HDInsight clusters using VS Code
  • specify the various clauses that can be used in Hive Query Language to manage objects and query data
  • work with Azure PowerShell and Beeline to execute Hive Query Language queries
  • create a database and tables, and load data into Hive tables from Azure Blob Storage and SQL Server
  • work with partitioned tables and manage Hive data formats
  • demonstrate how to install Hue and manage Hive queries from the Hue interface
  • demonstrate the approaches involved in retrieving Hive data and creating visualizations in Power BI
  • work with Hive as an ETL tool
  • compare HBase and Hive from the data modeling perspective
  • create a Hive table and load data from an external SQL Server

Overview/Description

Explore the Microsoft Analytics Platform System and the use of Hive to manage data from a data warehouse perspective.



Prerequisites: none

Data Warehousing with Hadoop: Spark, HDInsight and Cluster Management

Course Number:
it_dfdwha_04_enus
Lesson Objectives

  • specify the essential capabilities of Spark and its architectural components
  • list the data structures along with the RDD and lineage concepts that are used in Spark
  • set up Spark clusters using PowerShell and an Azure Resource Manager template
  • describe the relationship between Spark SQL and Hive
  • specify the essential concepts of Spark SQL and DataFrame
  • demonstrate the approach of customizing HDInsight clusters using bootstrap
  • install Hadoop applications on Azure HDInsight
  • illustrate the use of Ambari as a tool to manage clusters
  • manage Hadoop clusters in HDInsight using the Azure CLI
  • specify the approach of troubleshooting and tuning HDInsight clusters
  • monitor Hadoop clusters in HDInsight to collect metrics for analysis
  • set up Spark clusters and manage them using the Ambari GUI

Overview/Description

Discover how to work with Spark and its in-memory data management capabilities. The course also covers how to manage and troubleshoot HDInsight clusters using Ambari and the Azure CLI.



Prerequisites: none
